University of New Hampshire ECE 824 Final Project
Authors: Colin Cambo, Austin Smith
NOTE: THIS DOCUMENT IS AN INTERACTIVE JUPYTER NOTEBOOK. IT CONTAINS ALL THE CODE NECESSARY TO REPRODUCE OUR RESULTS. THROUGHOUT THIS PAPER YOU WILL SEE CODE EXAMPLES SHOWING HOW EACH STEP WAS EXECUTED, AND AN EXPLANATION OF THE CODE IN ITALICS.
For more information on Jupyter Notebooks please check <a href = 'http://jupyter.org/about.html'>here.</a>
<a id = '0'></a>
The movement towards smart cities and smart campuses is a growing global trend and an active area of research within ubiquitous computing. The idea is that managing information systems and installing smart technology can help solve problems, or improve areas that need it, on a college campus or in a city. Some of these issues include:
"How can we create a safe environment for students?" "Where are students spending the most time, and how can access to those spaces be managed more effectively?" "Where can more Wi-Fi access points be added to improve internet access for students?" "Where and when will students be most likely to see a mass advertisement or an important message from the university?"
The list of improvements university administrators would like to make is immense, and the idea of smart campus technology is to use networked technology in the background to streamline campus operations and improve their efficiency. The problem with the smart campus concept is that adding these technologies and systems is complex and costly for the university, especially relative to the utility they may provide. Often, when looking at the cost-benefit analysis of these technologies, the results are impressive but the systems are not worth the cost.
In response to this issue, we considered whether our university (the University of New Hampshire) could make use of its existing technology infrastructure to get an idea of how students travel around campus. The idea is that tracking connection times to the various access points on campus would give the university a way to follow student movement patterns and understand how the student body uses the campus as a whole.
import pandas as pd
from datetime import datetime
from collections import defaultdict, Counter
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches
from ipywidgets import widgets
from IPython.display import display, clear_output
from ipywidgets import interact, interactive, fixed
from matplotlib import colors as colors_
from mpl_toolkits.basemap import Basemap
%matplotlib inline
<a id = '1'></a>
<a id = '1.1'></a>
Working with the University of New Hampshire Information Technology department, we were able to secure a week's worth of historical data. This dataset detailed every single connection to the university wireless network in that time frame. The following variables were included:
The data we were given was split into 14 separate CSV files. For each day there was one CSV for residential building connections and one CSV for all other buildings.
Randomizing Data
To keep this data secure and anonymous, we altered some of it. We replaced the dates of the week we were given with those of a random week so that no one could find their own connections, while making sure the weekdays still line up so that general insights can still be gathered. We also replaced the last 6 hex digits of every MAC address with an assigned number so that they no longer represent people's actual MAC addresses. We chose to assign a number between 0x000000 and 0xFFFFFF instead of hashing because hashing the last six digits greatly increased the length of the string, and we believe this serves the same function of keeping users' MAC addresses anonymous. The last thing we did was remove the access point descriptions, so you can only tell when someone connects to a building, not to an individual access point. This helped us save space and provides another layer of security, since it becomes very difficult to know which access point is in each room.
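As an illustration of the MAC address replacement described above, here is a minimal sketch. It assumes numbers are assigned in order of first appearance; the function name and the sample addresses are made up for this example, and the notebook does not show the procedure that was actually used.

```python
def anonymize_macs(macs):
    """Replace the last six hex digits of each MAC address with an assigned number.

    Assumption for illustration: numbers are handed out in order of first
    appearance, so the same device always maps to the same anonymized address.
    """
    assigned = {}
    anonymized = []
    for mac in macs:
        if mac not in assigned:
            assigned[mac] = len(assigned) + 1
        # Keep the 6-digit company prefix, swap in the assigned 24-bit number
        anonymized.append(mac[:6] + format(assigned[mac], '06X'))
    return anonymized

# Made-up addresses in the notebook's format (upper case, colons removed)
sample = ['38CADA1A2B3C', 'AABBCC445566', '38CADA1A2B3C']
print(anonymize_macs(sample))
```

Keeping the first six digits intact is what allows the company lookup later in the notebook to keep working on the anonymized data.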
The code below reads in two sample datasets
#Converters for standardizing data being read in
convert_time = lambda x: x.replace('+00', '')
convert_mac = lambda x: x.replace(':', '').upper()
auth_df = pd.read_csv(r'.\data\WiFi_Data\randomized_auth_2015-09-19.csv',names=['MAC_Address', 'Time', 'Access_Point'],
converters={'Time':convert_time, 'MAC_Address':convert_mac})
xt_auth_df = pd.read_csv(r'.\data\WiFi_Data\randomized_xt_auth_2015-09-19.csv',names=['MAC_Address', 'Time', 'Access_Point'],
converters={'Time':convert_time, 'MAC_Address':convert_mac})
For illustrative purposes, the first ten rows of both datasets are shown below. The next step was to work with these variables and create a few additional ones to make the data more useful.
print('This is the first ten rows of the residential data set!')
print(auth_df.head(10))
print('Length: ', len(auth_df))
print('This is the first ten rows of the non-residential data set!')
print(xt_auth_df.head(10))
print('Length: ', len(xt_auth_df))
We decided to merge the data into a single DataFrame and add some new columns to make it easier to use. Below is the code that merges the 14 files into a single DataFrame and adds three new columns.
The three new columns are as follows:
'Hours': This column is the hour of the connection time, taken out of the main 'Time' column.
'Weekday': This column is the day of the week of the connection time (0 = Monday, 6 = Sunday), taken out of the main 'Time' column; it allows for somewhat easier filtering by date/time.
'Minutes': This column is the minute of the connection time, taken out of the main 'Time' column.
NOTE: Running the code below might take a while.
wifi_df = pd.DataFrame() # Initialize empty DataFrame
for i in range(19,26):
    auth_df = pd.read_csv(r'.\Data\WiFi_Data\randomized_auth_2015-09-'+str(i)+'.csv',
                          names=['MAC_Address', 'Time', 'Access_Point'],
                          converters={'Time':convert_time, 'MAC_Address':convert_mac})
    #Same converters as the residential file so both 'Time' columns share one format
    xt_auth_df = pd.read_csv(r'.\Data\WiFi_Data\randomized_xt_auth_2015-09-'+str(i)+'.csv',
                             names=['MAC_Address', 'Time', 'Access_Point'],
                             converters={'Time':convert_time, 'MAC_Address':convert_mac})
    #Concatenating DataFrame's
    wifi_df = pd.concat([wifi_df, auth_df, xt_auth_df])
#Creating datetime columns
datetimes = [(datetime.strptime(str(t), '%Y-%m-%d %H:%M:%S')) for t in wifi_df['Time'].tolist()]
hours = [t.hour for t in datetimes]
day_of_week = [t.weekday() for t in datetimes]
minutes = [t.minute for t in datetimes]
wifi_df['Hours'] = hours
wifi_df['Weekday'] = day_of_week
wifi_df['Minutes'] = minutes
wifi_df.head()
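To make the derived columns concrete, the sketch below parses one made-up timestamp in the same format and pulls out the hour, weekday, and minute. The helper name is hypothetical; it only mirrors the `strptime` call used above.

```python
from datetime import datetime

def time_parts(stamp):
    """Return (hour, weekday, minute) for a 'YYYY-MM-DD HH:MM:SS' timestamp.

    A trailing '+00' UTC offset, as stripped by convert_time, is removed first.
    """
    t = datetime.strptime(stamp.replace('+00', ''), '%Y-%m-%d %H:%M:%S')
    return t.hour, t.weekday(), t.minute

# 2015-09-19 fell on a Saturday, so weekday() gives 5 (0 = Monday, 6 = Sunday)
print(time_parts('2015-09-19 13:05:42+00'))  # (13, 5, 5)
```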
<a id = '1.2'></a> Since our dataset included MAC addresses, we decided it would be beneficial to match each MAC address with the company it is registered to, so that we can better guess what kind of device it is. We were able to locate a nicely formatted CSV of MAC address prefixes and their companies in a GitHub repository (https://github.com/TakahikoKawasaki/nv-oui/blob/master/data/oui.csv). Using this file, we appended the company for each MAC address to our dataset.
df = pd.read_csv(r'.\Data\oui.csv')#Reading in MAC Address Company csv as DataFrame
#Converting DataFrame above to dictionary for O(1) lookup time (.iloc replaces .ix, which newer pandas no longer provides)
oui_dict = dict(zip(df.iloc[:,1], df.iloc[:,2]))
def find_device(mac):
    """
    Returns company registered to MAC Address
    Keyword Argument:
    mac -- MAC Address Prefix (6 Digit Hexadecimal string)
    """
    return oui_dict.get(mac, 'UNKNOWN')
The code below runs the find_device function for every MAC Address in our dataset and attaches the companies to our dataset.
company = [find_device(str(r)[:6]) for r in wifi_df['MAC_Address'].tolist()]
wifi_df['MAC_Company'] = company
Below are the first five rows of the data with the new columns added.
wifi_df.head()
#Makes it easy to convert numbers back to weekday names
weekday_dict = {0:'Monday', 1:'Tuesday', 2:'Wednesday', 3:'Thursday', 4:'Friday', 5:'Saturday', 6:'Sunday'}
unique_mac_cnt = Counter(wifi_df.drop_duplicates(['MAC_Address'])['MAC_Company'].tolist())#Counts # of unique company addresses
top_10_companies = sorted(list(unique_mac_cnt.items()), key= lambda x: x[1], reverse=True)[:10]#Converts top ten to list
Plotting a simple bar graph of the unique MAC addresses for each company reveals that Apple is overwhelmingly the device maker of choice on campus.
plt.bar(range(10), [x[1] for x in top_10_companies])
plt.xticks(range(10), [x[0] for x in top_10_companies], rotation=90)
plt.xlim([0,10])
plt.title('Unique MAC Addresses Per Company On Campus')
plt.ylabel('Number of Unique MAC Addresses')
plt.xlabel('Company')
As the bar graph shows, Apple accounts for 25,000+ unique devices, while the next-largest company has around 4,000. Upon discovering this, we decided to focus solely on Apple devices, as they are more likely to be portable devices such as phones or laptops. While there may be some other devices such as Apple TVs and desktop computers, we felt that focusing solely on Apple devices was the best way to capture students' movements without too much noise from network devices that were likely not traveling with students.
The code below filters the DataFrame for only rows that have 'MAC_Company' equal to 'Apple, Inc.'
wifi_df = wifi_df[wifi_df['MAC_Company']=='Apple, Inc.'].sort_values(by=['Time'])#Selecting only Apple devices
wifi_df = wifi_df.reset_index(drop=True)#Resetting index
<a id = '1.3'></a>
The University of New Hampshire IT department provided us with the CSV "Access_Locations.csv", which contains each access point number and the building that access point is in. To work with the data more easily, we added a column to our dataset giving the building in which each connection takes place.
The code below reads in the "Access_Locations.csv" dataset as a DataFrame, then iterates through it, adding each access point and its corresponding building to a dictionary. The dictionary is then used to generate a new "Building" column for every row in the original wifi_df dataset. Rows whose access point is not found in the dictionary are dropped.
aploc = pd.read_csv(r'.\data\Access_Locations.csv', header=0)
building_dict = defaultdict(lambda: 'Unknown')
for row in aploc.iterrows():
    building_dict[row[1]['access_point']] = row[1]['building']
buildings_add = [building_dict[key] for key in wifi_df['Access_Point'].tolist()]
wifi_df['Building'] = buildings_add
wifi_df = wifi_df[wifi_df['Building']!='Unknown'].reset_index(drop=True)
We noticed that when tracking an individual user, we would get a dataset showing the following:
mac_track_df = wifi_df[wifi_df.MAC_Address=='38CADA000001'].reset_index(drop=True)
mac_track_df.head(10)
This shows us that a majority of connections are just from someone switching access points in the same building, which doesn't give us very much information on their movements on a macro level.
<a id = '1.4'></a>
To solve the problem of repeated building rows, we chose to eliminate movements within a building, since we are mostly concerned with how students travel around campus. We eliminated these intra-building movements by keeping only the first connection each time a device arrives in a building.
The section of data shown below illustrates what the dataset looks like after focusing only on unique connections. We removed the non-informative data points and added a variable called 'Time_Since_Last_Connect', which is the time in minutes since the MAC address connected to its previous building. This gives us an idea of how much time elapsed between connections from one building to the next.
mac_list = wifi_df['MAC_Address'].tolist()
building_list = wifi_df['Building'].tolist()
unique_checker = [0]*len(wifi_df)
mac_dict = defaultdict(lambda: 'None')
#Flag a row only when the device shows up in a different building than its previous row
for i, user in enumerate(mac_list):
    if mac_dict[user]=='None':
        mac_dict[user] = building_list[i]
        unique_checker[i] = 1
    elif mac_dict[user]!=building_list[i]:
        unique_checker[i] = 1
        mac_dict[user] = building_list[i]
wifi_df['Check'] = unique_checker
wifi_df = wifi_df[(wifi_df.Check==1) & (wifi_df.Building!='0')].reset_index(drop=True)
wifi_df = wifi_df.drop('Check', axis=1)
last_time_dict = {}
last_time_list = [0]*len(wifi_df)
time = wifi_df['Time'].tolist()
#Minutes elapsed since the same device's previous kept connection (0 for its first one)
for i, row in enumerate(wifi_df['MAC_Address'].tolist()):
    if row not in last_time_dict:
        last_time_list[i] = 0
        last_time_dict[row] = time[i]
    else:
        last_time_list[i] = int((datetime.strptime(time[i],
            '%Y-%m-%d %H:%M:%S') - datetime.strptime(last_time_dict[row],
            '%Y-%m-%d %H:%M:%S')).total_seconds()/60)
        last_time_dict[row] = time[i]
wifi_df['Time_Since_Last_Connect'] = last_time_list
wifi_df = wifi_df.dropna().reset_index(drop=True)
mac_track_df = wifi_df[wifi_df.MAC_Address=='38CADA000001'].reset_index(drop=True)
mac_track_df.head()
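The two loops above can also be read as a single pass over time-sorted rows. The sketch below shows that idea on a few made-up rows; the function name and the tuple layout are assumptions for illustration and are not part of the notebook.

```python
from datetime import datetime

FMT = '%Y-%m-%d %H:%M:%S'

def building_changes(rows):
    """Keep the first connection of each building visit per device.

    rows -- time-sorted (mac, time, building) tuples (hypothetical layout)
    Returns (mac, time, building, minutes_since_last_kept_connection) tuples.
    """
    last_building, last_time, kept = {}, {}, []
    for mac, stamp, building in rows:
        if last_building.get(mac) == building:
            continue  # same building as this device's previous row: skip it
        t = datetime.strptime(stamp, FMT)
        minutes = int((t - last_time[mac]).total_seconds() / 60) if mac in last_time else 0
        kept.append((mac, stamp, building, minutes))
        last_building[mac], last_time[mac] = building, t
    return kept

rows = [
    ('A', '2015-09-21 09:00:00', 'Kingsbury'),
    ('A', '2015-09-21 09:10:00', 'Kingsbury'),   # access point switch, dropped
    ('B', '2015-09-21 09:15:00', 'Stoke'),
    ('A', '2015-09-21 09:45:00', 'Philbrook'),
    ('A', '2015-09-21 10:30:00', 'Kingsbury'),
]
for row in building_changes(rows):
    print(row)
```

Note that a device returning to a building it visited earlier is kept, because only consecutive repeats are treated as movement within a building.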
To plot the buildings and use them in visualizations, we needed a table containing their coordinates in latitude and longitude. We used satellite imagery of the campus and a map to create this table manually; its first five rows are shown below.
unh_buildings = pd.read_csv(r'.\Data\Building_Locations.csv')
unh_buildings.head()
list_buildings = unh_buildings['building_names'].tolist()
#Dictionary with building as key and a tuple of latitude and longitude as value
building_coords = {row.iloc[0]:(row.iloc[1], row.iloc[2]) for _, row in unh_buildings.iterrows()}
<a id = '2'></a>
The next step in our analysis was to run some descriptive statistics in order to get a better understanding of the data.
<a id = '2.1'></a> After cleaning the data, we ended up with 1,020,992 connections over the course of the week.
len(wifi_df)
Next we looked at the ten most frequently connected-to access points on campus. The list below shows them in order of most connections.
most_connected_ap = sorted(list(Counter(wifi_df['Access_Point'].tolist()).items()), key= lambda x:x[1], reverse=True)[:10]
for ap in most_connected_ap:
    print('Access Point: {} Building: {} Connections: {}'.format(ap[0], building_dict[ap[0]], ap[1]))
def plot_top_paths(building, days, hour_range, percent=False):
    """
    Plots bar graph top 10 destinations from specified buildings at the specified time/day.
    Keyword Arguments:
    building -- name of building (string)
    days -- list of days interested in
    hour_range -- tuple of start and end hours
    percent -- Displays percentage on y-axis if True, raw count if false (boolean)
    """
    hours = list(range(hour_range[0],hour_range[1]+1))
    my_data = wifi_df[(wifi_df.Weekday.isin(days)) & (wifi_df.Hours.isin(hours))]
    paths = {b:defaultdict(int) for b in list_buildings}
    for b in list_buildings:
        for c in list_buildings:
            if c != b:
                paths[b][c] = 0
    #'0' marks a device that has not been seen yet in the filtered data
    mac_add = {x:'0' for x in my_data['MAC_Address'].tolist()}
    mac_list = my_data['MAC_Address'].tolist()
    build_list = my_data['Building'].tolist()
    for i in range(len(mac_list)):
        if mac_add[mac_list[i]] == '0':
            mac_add[mac_list[i]] = build_list[i]
        else:
            try:
                paths[mac_add[mac_list[i]]][build_list[i]] += 1
            except KeyError:
                #Previous building is not in list_buildings, so the path is not counted
                pass
            mac_add[mac_list[i]] = build_list[i]
    total = sum([x[1] for x in paths[building].items()])
    #Keep only the ten most common destinations
    top_paths = sorted(list(paths[building].items()), key= lambda x:x[1], reverse=True)[:10]
    plot_x, plot_y = [], []
    for path in top_paths:
        if percent==True:
            #Whole-number percentage; 0 when no paths were found at all
            plot_y.append(round(100*path[1]/total) if total else 0)
        else:
            plot_y.append(path[1])
        plot_x.append(path[0])
    ax = plt.bar(range(len(plot_x)), plot_y, align='center')
    plt.xticks(range(len(plot_x)), plot_x, rotation=90)
    plt.xlim([-0.5, len(plot_x)-0.5])
    plt.xlabel('Buildings Travelled to')
    if percent==True:
        plt.ylabel('Percent of Connections')
    else:
        plt.ylabel('Number of Connections')
    plt.title('Top Paths From '+ str(building)+ ' On '+', '.join([(weekday_dict[b]) for b in days])+' For Hours '+
              str(hour_range[0]) +' to '+str(hour_range[1]))
    return ax
<a id = '2.2'></a> After performing some descriptive statistics, we started to visualize paths and devised a way to see which buildings students were travelling to from a given building.
Plotting the top paths from a building can be very informative for understanding how students move throughout campus. The bar graph below comes from a function we created, which has the following parameters:
This bar graph shows the next building that students who were in Kingsbury Hall on Saturday between midnight and 1 PM travelled to.
plot_top_paths('Kingsbury', [5], (0,12), percent=True)
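The counting behind a plot like this can be sketched on toy data. The example below is a simplified stand-in, not the notebook's function: it assumes time-sorted `(mac, building)` pairs in which consecutive repeats have already been removed, and the device labels are made up.

```python
from collections import Counter

def next_building_counts(rows, origin):
    """Count where devices connect next after being seen in `origin`.

    rows -- time-sorted (mac, building) pairs (hypothetical layout)
    """
    last_seen = {}
    counts = Counter()
    for mac, building in rows:
        if last_seen.get(mac) == origin and building != origin:
            counts[building] += 1
        last_seen[mac] = building
    return counts

rows = [('A', 'Kingsbury'), ('B', 'Kingsbury'), ('A', 'Philbrook'),
        ('B', 'Dimond Library'), ('C', 'Kingsbury'), ('C', 'Philbrook')]
counts = next_building_counts(rows, 'Kingsbury')
total = sum(counts.values())
# Share of each destination as a whole-number percentage
shares = {b: round(100 * n / total) for b, n in counts.most_common()}
print(counts)
print(shares)
```

Dividing by the total number of departures from the origin is what the `percent=True` option corresponds to.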
def plot_connections(building, day, color='Black'):
    """
    Plots scatter plot of unique building connections over specified day.
    Keyword arguments:
    building -- name of building (string)
    day -- day of the week (int)
    color -- color of points on graph, default Black
    """
    df = wifi_df[(wifi_df['Building'].str.contains(building)==True) & (wifi_df['Weekday']==day)]
    c = Counter(df['Hours'].tolist())
    #Plot each count at its actual hour, even when some hours have no connections
    hours = sorted(c)
    ax = plt.scatter(hours, [c[h] for h in hours], color=color)
    plt.xlim([0, 23])
    plt.xlabel('Hour of Day')
    plt.ylabel('Number of Unique Connections')
    plt.title('Connections For '+ str(building))
    return ax
<a id = '2.3'></a> We also created a simple function to show the number of unique connections within a building over the course of the day. Because this function returns the object created by matplotlib's scatter call, we can call it multiple times and stack the results for an easy comparison between buildings or days.
Below are two scatterplots created with this function. The first one's parameters are:
The second one's parameters are:
With this simple function we are able to gather a great deal of information and answer simple questions such as "What weekday is the busiest for a building?" or "What are the peak hours for the dining halls?"
d = plot_connections('Kingsbury', 0, 'Blue')
e = plot_connections('Kingsbury', 4, 'Red')
plt.legend([d, e], ['Monday', 'Friday'])
plt.show()
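One detail worth keeping in mind when counting connections per hour is that hours with no connections never appear in a `Counter`. A small sketch (the helper name is hypothetical) that fills those hours with zeros, so every count lines up with its hour on the x-axis:

```python
from collections import Counter

def hourly_counts(hours):
    """Return a 24-element list of connection counts, one per hour of the day."""
    counts = Counter(hours)
    return [counts.get(h, 0) for h in range(24)]

# Made-up 'Hours' values for one building on one day
per_hour = hourly_counts([8, 8, 9, 13])
print(per_hour[8], per_hour[9], per_hour[10], per_hour[13])  # 2 1 0 1
```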
def get_plot_paths(day, hour):
    """
    Returns nested dictionary of all building paths and # of people who took path for specified day and hour.
    Keyword arguments:
    day -- day of the week (list(int))
    hour -- hours of day (list(int))
    """
    my_paths = wifi_df[(wifi_df.Weekday.isin(day)) & (wifi_df.Hours.isin(hour))]
    paths = {b:defaultdict(int) for b in list_buildings}
    for b in list_buildings:
        for c in list_buildings:
            if c != b:
                paths[b][c] = 0
    #'0' marks a device that has not been seen yet in the filtered data
    mac_add = {x:'0' for x in wifi_df['MAC_Address'].tolist()}
    mac_list = my_paths['MAC_Address'].tolist()
    build_list = my_paths['Building'].tolist()
    for i in range(len(mac_list)):
        if mac_add[mac_list[i]] == '0':
            mac_add[mac_list[i]] = build_list[i]
        else:
            try:
                paths[mac_add[mac_list[i]]][build_list[i]] += 1
            except KeyError:
                #Previous building is not in list_buildings, so the path is not counted
                pass
            mac_add[mac_list[i]] = build_list[i]
    return paths
# To understand what the above function is returning uncomment out the line below
#print(get_plot_paths([5], [1, 2, 3, 4, 5, 6, 7, 8]))
my_colors = 'blue red green yellow purple black orange white teal crimson cyan brown gray hotpink lavender'.split()
def return_all_connections(days, hours):
    """
    Returns dictionary with buildings as keys and # of unique connections in building at specified time/day as values
    Keyword arguments:
    days -- days of the week (list(int))
    hours -- hours of each day (list(int))
    """
    my_data = wifi_df[(wifi_df.Weekday.isin(days)) & (wifi_df.Hours.isin(hours))]
    building_count = Counter(my_data['Building'].tolist())
    return dict(building_count.items())
def plot_path_lines(days, hour_range, building, heat_map=False):
    plt.figure(figsize=(12,16))
    hours = list(range(hour_range[0],hour_range[1]+1))
    path_dict = get_plot_paths(days, hours)
    color = 'blue red green yellow purple black orange white teal'.split()
    if len(building)>8:
        print("Too many buildings selected! Can only plot 8")
        return
    m=Basemap(projection='merc',
              llcrnrlon=-70.931135416,
              llcrnrlat=43.1341063997,
              urcrnrlon=-70.9164369106,
              urcrnrlat=43.149489287,
              resolution='l',
              epsg=4236)
    m.drawmapboundary(fill_color='#F5F5F5', linewidth=0)
    m.arcgisimage(service='World_Street_Map', xpixels=1000, verbose= False)
    #Offsets necessary to plot Google Maps coordinates on our ArcGIS map
    x_offset = 0.0095
    y_offset = 0.004
    total=0
    dot_scale = 0
    if heat_map==True:
        heat_dict = return_all_connections(days, hours)
        total = sum([val[1] for val in list(heat_dict.items())])
        dot_scale = 40/(max([val[1] for val in list(heat_dict.items())])/total)
    #Plot every building as a dot (.iloc replaces .ix, which newer pandas no longer provides)
    for i in range(len(unh_buildings)):
        x, y = m(unh_buildings.iloc[i,2]+x_offset, unh_buildings.iloc[i,1]+y_offset)
        if heat_map==True:
            try:
                m.plot(x, y, 'o', markersize=dot_scale*(heat_dict[unh_buildings.iloc[i,0]]/total), color='#444444', alpha=0.6)
            except:
                pass
        else:
            m.plot(x, y, 'o', markersize=5, color='#444444', alpha=0.6)
    if heat_map==False:
        #Highlight the selected buildings in their own colors
        for i, b in enumerate(building):
            x, y = m(building_coords[b][1]+x_offset, building_coords[b][0]+y_offset)
            m.plot(x, y, 'o', markersize=5, color=my_colors[i], alpha=0.8)
        total = 0
        for b in building:
            for key in unh_buildings['building_names'].tolist():
                total+=path_dict[b][key]
        building_max = 0
        for c in building:
            new_max = max([b[1] for b in list(path_dict[c].items())])
            if building_max < new_max:
                building_max = new_max
        alpha_scale = 1/building_max
        #Draw one line per path, with width and opacity scaled by the number of connections
        for i, b in enumerate(building):
            for key in path_dict[b]:
                line_width = (path_dict[b][key]/total)*50*len(building)
                #alpha = (path_dict[b][key]/total)*10*len(building)
                try:
                    lonlist = [building_coords[key][1]+x_offset, building_coords[b][1]+x_offset]
                    latlist = [building_coords[key][0]+y_offset, building_coords[b][0]+y_offset]
                    x, y = m(lonlist,latlist)
                    m.plot( x, y, color=my_colors[i], lw=line_width, alpha=alpha_scale*path_dict[b][key])
                except:
                    pass
        plt.title('Connections From '+str(', '.join(building)) + ' For Days '+ ', '.join([(weekday_dict[day]) for day in days]) \
                  + ' For Time Range ' + str(hour_range[0]) + ' to ' + str(hour_range[1]))
        patches = []
        for i in range(len(building)):
            patches.append(mpatches.Patch(color=my_colors[i], label=building[i]))
        plt.legend(handles=patches)
    else:
        plt.title('Heat Map of Connections For Days '+ ', '.join([(weekday_dict[day]) for day in days])+ ' For Time Range ' \
                  + str(hour_range[0]) + ' to ' + str(hour_range[1]))
        print('Biggest dot is '+ str(max([val[1] for val in list(heat_dict.items())]))+ ' Unique Connections')
<a id = '2.4'></a> From the outset of this project, we realized that much of this data would be most interesting to see as a plotted visualization, so we queried ArcGIS's database to get a detailed map of the UNH campus and overlaid the building points on top of it. From there we were able to create some very interesting visualizations that helped us see what was going on around campus.
To get a detailed map of the UNH campus, we used the ArcGIS server to download a high-resolution image of campus and then overlaid our points on top. This was definitely one of the more challenging parts of the project, especially because the coordinates we gathered for every building turned out to be slightly off from the ArcGIS image. This is due to ArcGIS and Google using different map projections, which differ on a micro level. You can read more about this issue here. To correct for it, we added an x and y offset to every point.
<a id = '2.4.1'></a> For our first visualization, we created a heatmap of the campus where each building is plotted as a dot. The size of each dot varies with the number of unique connections in the building. In this particular plot, 'Holloway Commons' has the most unique connections between noon and 4 PM (2,918 total connections).
plot_path_lines([0], (12,16), [], heat_map=True)
<a id = '2.4.2'></a> Next, we plotted where students were travelling to within the same time range of noon to 4 PM. The plot below shows where students who were in 'Dimond Library', 'Kingsbury', and 'Stoke' connected next. The thickness and opacity of each line show the number of students: the thicker the line, the greater the number of students who travelled that particular path.
plot_path_lines([5], (12,16), ['Dimond Library', 'Kingsbury', 'Stoke'], heat_map=False)
<a id = '2.4.3'></a> We found these results interesting; however, we wanted to track not just where students went directly from one location, but also where they went after that. That's where we got the idea of plotting iterative building paths. Plotting students' paths iteratively reveals much more about their movements on campus.
<a id = '2.4.3.1'></a> The next group of plots tracks students originating from 'Philbrook Hall'. The first iteration shows where students travelled from 'Philbrook Hall', again with the thickness of each line indicating how many students followed that path. The second iteration shows where those students went next, essentially showing their second stop. Finally, the third iteration shows their third stop, reached from their second stop. This is a very powerful map, as it allows us to visualize how students travel around campus given where they start.
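As a rough sketch of the counting behind each iteration, assume each device's visits have already been collected into an ordered list of buildings that starts at the origin. The function name, the dictionary layout, and the device labels below are hypothetical and only meant to show the idea.

```python
from collections import Counter

def hop_counts(sequences, iteration):
    """Count (from, to) building pairs for the given 1-based hop.

    sequences -- dict mapping a device to its ordered list of buildings
    iteration -- 1 for the first move away from the origin, 2 for the next, ...
    """
    counts = Counter()
    for visits in sequences.values():
        if len(visits) > iteration:
            counts[(visits[iteration - 1], visits[iteration])] += 1
    return counts

sequences = {
    'A': ['Philbrook', 'Kingsbury', 'Philbrook'],
    'B': ['Philbrook', 'Kingsbury'],
    'C': ['Philbrook', 'Christensen', 'Philbrook', 'Williamson'],
}
for it in (1, 2, 3):
    print(it, dict(hop_counts(sequences, it)))
```

Devices with fewer stops simply drop out of later iterations, which is why the number of connections can shrink from one iteration to the next.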
def get_people_paths(day, hour, building):
    my_paths = wifi_df[(wifi_df.Weekday.isin(day)) & (wifi_df.Hours.isin(hour))]
    my_people_paths = my_paths[my_paths.Building == building]
    people_list = my_people_paths['MAC_Address'].tolist()
    my_paths2 = my_paths[my_paths.MAC_Address.isin(people_list)]
    full_people_list = my_paths2['MAC_Address'].tolist()
    connection_building = my_paths2['Building'].tolist()
    people_connections = {mac:0 for mac in people_list}
    for i in range(len(my_paths2)):
        if people_connections[full_people_list[i]] != 0:
            people_connections[full_people_list[i]].append(connection_building[i])
        elif building == connection_building[i]:
            people_connections[full_people_list[i]] = [building]
    return people_connections
def plot_iterative_path_lines(days, hour_range, building, iterations, start_num='all', max_buildings=15, path_num=15):
    build_num = start_num
    building_label, last_buildings = [], [building]
    hours = list(range(hour_range[0],hour_range[1]+1)) #Converting tuple range to list of hours
    people_paths = get_people_paths(days, hours, building) #Getting the people's paths at the specified parameters
    mac_list = people_paths.keys()
    all_paths = []
    #.iloc replaces .ix, which newer pandas no longer provides
    for b in unh_buildings.iloc[:,0]:
        for c in unh_buildings.iloc[:,0]:
            all_paths.append([b, c])
    for it in range(1, iterations+1):
        plt.figure(figsize=(12,16))
        if build_num!='all':
            path_dict = {(a[0],a[1]):0 for a in all_paths}
        else:
            path_dict = {b:{c:0 for c in unh_buildings.iloc[:,0]} for b in unh_buildings.iloc[:,0]}
        m=Basemap(projection='merc',
                  llcrnrlon=-70.931135416,
                  llcrnrlat=43.1341063997,
                  urcrnrlon=-70.9164369106,
                  urcrnrlat=43.149489287,
                  resolution='l',
                  epsg=4236)
        m.drawmapboundary(fill_color='#F5F5F5', linewidth=0)
        m.arcgisimage(service='World_Street_Map', xpixels=1000, verbose= False)
        #Offsets necessary to plot Google Maps coordinates on our ArcGIS map
        x_offset = 0.0095
        y_offset = 0.004
        for i in range(len(unh_buildings)):
            x, y = m(unh_buildings.iloc[i,2]+x_offset, unh_buildings.iloc[i,1]+y_offset)
            m.plot(x, y, 'o', markersize=5, color='#444444', alpha=0.6)
        total = 0
        #Count the move each person made on this iteration
        for mac in mac_list:
            if len(people_paths[mac])>it:
                try:
                    if build_num!='all':
                        path_dict[(people_paths[mac][it-1], people_paths[mac][it])]+=1
                        total+=1
                    else:
                        path_dict[people_paths[mac][it-1]][people_paths[mac][it]]+=1
                        total+=1
                except:
                    pass
        scale = 50
        patches = []
        label_dict, color_dict = {}, {}
        if build_num !='all':
            path_list = []
            for b in unh_buildings.iloc[:,0]:
                for c in unh_buildings.iloc[:,0]:
                    try:
                        if path_dict[(b,c)] > 0:
                            if b in last_buildings:
                                path_list.append((b, c, path_dict[(b,c)]))
                    except:
                        print('Error here?: ',path_dict[(b,c)])
            top_paths = sorted(path_list, key=lambda x:x[2], reverse=True)
            if it==1:
                top_paths = top_paths[:int(build_num)]
            else:
                top_paths = top_paths[:path_num]
            top_buildings = [x[0] for x in top_paths]
            count = 0
            #One color per starting building
            for b in top_buildings:
                if b not in color_dict:
                    color_dict[b] = my_colors[count]
                    count+=1
            total = sum([x[2] for x in top_paths])
            scale = int(build_num)*3
            new_path_dict = {(b[0],b[1]):b[2] for b in top_paths}
            last_buildings = [x[1] for i, x in enumerate(top_paths) if i<max_buildings]
            for i, key in enumerate(new_path_dict.keys()):
                line_width = (new_path_dict[key]/total)*scale
                try:
                    lonlist = [building_coords[key[0]][1]+x_offset, building_coords[key[1]][1]+x_offset]
                    latlist = [building_coords[key[0]][0]+y_offset, building_coords[key[1]][0]+y_offset]
                    x, y = m(lonlist, latlist)
                    #alpha = (new_path_dict[key]/total)*(build_num/2)
                    m.plot( x, y, color=color_dict[key[0]], lw=line_width, alpha=.8)#alpha)
                    label = 'Path From '+str(key[0])
                    if label in label_dict:
                        pass
                    else:
                        label_dict[label] = color_dict[key[0]]
                        patches.append(mpatches.Patch(color=color_dict[key[0]], label=label))
                except:
                    pass
            plt.legend(handles=patches)
        else:
            new_path_dict = {x:{y:z for y,z in path_dict[x].items() if y and z!=0 and z} for x,y in path_dict.items() if x}
            for i, b in enumerate(new_path_dict.keys()):
                for key in new_path_dict[b]:
                    line_width = (new_path_dict[b][key]/total)*scale#*it #*len(path_dict.keys())
                    lonlist = [building_coords[key][1]+x_offset, building_coords[b][1]+x_offset]
                    latlist = [building_coords[key][0]+y_offset, building_coords[b][0]+y_offset]
                    x, y = m(lonlist, latlist)
                    m.plot( x, y, color='Black', lw=line_width, alpha=.8)
        plt.title('Originating From: '+str(building) +' Iteration: '+str(it)+' With '+str(total)+' Connections')
plot_iterative_path_lines([5], (12,16), 'Philbrook', 3, start_num='all')
<a id = '2.4.3.2'></a> Because this last visualization is so chaotic, we decided to edit the function to accept a few additional arguments that narrow the scope of what we're looking at and color code the lines accordingly. This helps people draw insight from the plot faster.
The arguments we added are the number of starting lines from the building, the maximum number of buildings to show paths for, and the number of lines to plot in each iteration after the first.
The plot below uses the following arguments:
plot_iterative_path_lines([5], (12,16), 'Philbrook', 4, start_num=10, max_buildings=8, path_num=30)
With these additional arguments we are able to easily trace the most common paths from any building and follow where they're going next for additional iterations.
This series of iterations does a great job of illustrating a Saturday in the life of a student who is likely eating breakfast/lunch at 'Philbrook', which is a dining hall. The first iteration seems to show students leaving Philbrook and travelling to various academic buildings (Kingsbury, Parsons, etc.) and dormitories (Christensen, Williamson, Haaland, etc.).
The second iteration then shows that many people who went to these dormitories and academic buildings actually end up going back to Philbrook, or at least come close enough to connect to its WiFi.
The third iteration again shows that most people are connected to Philbrook, and they're now overwhelmingly going back to their dorms.
The fourth iteration then shows that once again many people are connecting to Philbrook.
This is a very interesting plot because there could be a lot going on beneath the surface that one might not be aware of without being familiar with the UNH campus. Rather than this original group of people eating three times within four hours, they're most likely connecting to the Philbrook dining hall while walking by it, because the path from Christensen and Williamson to campus happens to pass Philbrook.
Another possible explanation for this phenomenon is that the access points' clocks aren't synchronized, so sorting our entries by time would not put them in the proper order. This is something we would like to investigate further, but we don't have the time at the moment.
<a id = '3'></a> To make these plots accessible to everyone for gaining insight, we decided to make them interactive.
<a id = '3.1'></a> From this interactive interface, one can quickly compare where most people are going from two buildings on a specified day and time.
On the plot 1 tab, select the building you're interested in from the dropdown. Click the small boxes next to the days of the week you're interested in, and then drag the time slider to reflect the time range you want to look at. Repeat these steps for plot 2 on the plot 2 tab, and use the checkbox to choose whether you want the results as percentages or as raw counts.
The example below compares Philbrook and Stillings on Monday and Wednesday between the hours of 7 and 10. From this chart we can see not only that Philbrook has more morning traffic on these days, but also that most people's next stop is close to the dining hall they're eating at.
def plot_both_top_paths(b):
    clear_output(wait=True)
    weekday_list1 = [i for i in range(7) if top_path_hbox2.children[i+1].value==True]
    weekday_list2 = [i for i in range(7) if top_path_hbox6.children[i+1].value==True]
    plt.figure(figsize=(16,12))
    plt.subplot(2, 2, 1)
    plot_top_paths(top_path_building_dropdown1.value, weekday_list1, top_path_time_slider1.value,
                   percent=top_path_percent_checkbox.value)
    plt.subplot(2, 2, 2)
    plot_top_paths(top_path_building_dropdown2.value, weekday_list2, top_path_time_slider2.value,
                   percent=top_path_percent_checkbox.value)
    plt.show()
top_path_building_text1 = widgets.Latex(value='Select Building One:', width='20%')
top_path_building_dropdown1 = widgets.Dropdown(options = list_buildings, height='25px')
top_path_hbox1 = widgets.HBox(children=[top_path_building_text1, top_path_building_dropdown1], width='100%', height='50px')
top_path_weekday_text1 = widgets.Latex(value='Day of Week:', width='10%')
top_path_monday_checkbox1 = widgets.Checkbox(description = 'Monday: ', value=False, width='10%')
top_path_tuesday_checkbox1 = widgets.Checkbox(description = 'Tuesday: ', value=False, width='10%')
top_path_wednesday_checkbox1 = widgets.Checkbox(description = 'Wednesday:', value=False, width='10%')
top_path_thursday_checkbox1 = widgets.Checkbox(description = 'Thursday: ', value=False, width='10%')
top_path_friday_checkbox1 = widgets.Checkbox(description = 'Friday: ', value=False, width='10%')
top_path_saturday_checkbox1 = widgets.Checkbox(description = 'Saturday: ', value=False, width='10%')
top_path_sunday_checkbox1 = widgets.Checkbox(description = 'Sunday: ', value=False, width='10%')
top_path_hbox2 = widgets.HBox(height='40px',width='100%')
top_path_hbox2.children = [top_path_weekday_text1, top_path_monday_checkbox1, top_path_tuesday_checkbox1, top_path_wednesday_checkbox1,
top_path_thursday_checkbox1, top_path_friday_checkbox1, top_path_saturday_checkbox1, top_path_sunday_checkbox1]
top_path_hour_text1 = widgets.Latex(value='Enter Time Range:', width='20%')
top_path_time_slider1 = widgets.IntRangeSlider(min=0,max=23,step=1,value=(0,23), width='80%')
top_path_hbox3 = widgets.HBox(height='40px',width='100%')
top_path_hbox3.children = [top_path_hour_text1, top_path_time_slider1]
top_path_submit = widgets.Button(description='Plot Top Paths!')
top_path_submit.on_click(plot_both_top_paths)
top_path_percent_checkbox = widgets.Checkbox(description='Display in percent: ', value=False, width=40)
top_path_hbox4 = widgets.HBox(height='40px',width='100%')
top_path_hbox4.children = [top_path_submit, top_path_percent_checkbox]
top_path_building_text2 = widgets.Latex(value='Select Building Two:', width='20%')
top_path_building_dropdown2 = widgets.Dropdown(options = list_buildings, height='25px')
top_path_hbox5 = widgets.HBox(children=[top_path_building_text2, top_path_building_dropdown2], width='100%', height='50px')
top_path_weekday_text2 = widgets.Latex(value='Day of Week:', width='10%')
top_path_monday_checkbox2 = widgets.Checkbox(description = 'Monday: ', value=False, width='10%')
top_path_tuesday_checkbox2 = widgets.Checkbox(description = 'Tuesday: ', value=False, width='10%')
top_path_wednesday_checkbox2 = widgets.Checkbox(description = 'Wednesday:', value=False, width='10%')
top_path_thursday_checkbox2 = widgets.Checkbox(description = 'Thursday: ', value=False, width='10%')
top_path_friday_checkbox2 = widgets.Checkbox(description = 'Friday: ', value=False, width='10%')
top_path_saturday_checkbox2 = widgets.Checkbox(description = 'Saturday: ', value=False, width='10%')
top_path_sunday_checkbox2 = widgets.Checkbox(description = 'Sunday: ', value=False, width='10%')
top_path_hbox6 = widgets.HBox(height='40px',width='100%')
top_path_hbox6.children = [top_path_weekday_text2, top_path_monday_checkbox2, top_path_tuesday_checkbox2, top_path_wednesday_checkbox2,
top_path_thursday_checkbox2, top_path_friday_checkbox2, top_path_saturday_checkbox2, top_path_sunday_checkbox2]
top_path_hour_text2 = widgets.Latex(value='Enter Time Range:', width='20%')
top_path_time_slider2 = widgets.IntRangeSlider(min=0,max=23,step=1,value=(0,23), width='80%')
top_path_hbox7 = widgets.HBox(height='40px',width='100%')
top_path_hbox7.children = [top_path_hour_text2, top_path_time_slider2]
# The submit button, percent checkbox and top_path_hbox4 created above are shared by both tabs.
top_path_tab1 = widgets.VBox(children=[top_path_hbox1, top_path_hbox2, top_path_hbox3])
top_path_tab2 = widgets.VBox(children=[top_path_hbox5, top_path_hbox6, top_path_hbox7])
top_path_tab = widgets.Tab(children=[top_path_tab1, top_path_tab2])
top_path_tab.set_title(0, 'Plot1')
top_path_tab.set_title(1, 'Plot2')
display(top_path_tab)
display(top_path_hbox4)
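The selection logic behind this interface (filter by building, weekdays and hour range, then report the most common next stops as counts or percentages) can be sketched in a few lines. This is an illustrative stand-in, not the `plot_top_paths` function used above, whose implementation is not shown in this section. The name `top_next_stops`, the `(weekday, hour, origin, destination)` record layout, the inclusive hour range, and the sample data are assumptions; the weekday numbering (0 = Monday) follows the checkbox order in the code above.

```python
from collections import Counter

def top_next_stops(transitions, building, weekdays, hour_range, percent=False, top_n=5):
    """Return the most common destinations reached from `building`.

    transitions: iterable of (weekday, hour, origin, destination) tuples
                 (hypothetical layout; weekday 0 = Monday ... 6 = Sunday).
    hour_range:  (first_hour, last_hour), treated as inclusive here.
    """
    first_hour, last_hour = hour_range
    counts = Counter(
        dest
        for weekday, hour, origin, dest in transitions
        if origin == building and weekday in weekdays and first_hour <= hour <= last_hour
    )
    top = counts.most_common(top_n)
    if percent:
        total = sum(counts.values())
        # `top` is empty whenever `total` is 0, so no division by zero occurs
        return [(dest, 100.0 * n / total) for dest, n in top]
    return top

# Made-up sample records, for illustration only
transitions = [
    (0, 8, 'Philbrook', 'Christensen'),
    (0, 9, 'Philbrook', 'Christensen'),
    (2, 8, 'Philbrook', 'Williamson'),
    (2, 12, 'Philbrook', 'Kingsbury'),   # outside the 7-10 window
    (1, 8, 'Philbrook', 'Kingsbury'),    # Tuesday, not selected
    (0, 8, 'Kingsbury', 'Parsons'),      # different origin
]
raw = top_next_stops(transitions, 'Philbrook', [0, 2], (7, 10))
pct = top_next_stops(transitions, 'Philbrook', [0, 2], (7, 10), percent=True)
print(raw)
print(pct)
```

Percentages here are taken over all matching departures from the building, not only over the top destinations shown.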
<a id = '3.2'></a> From this interactive interface one is able to compare the hourly connections for up to three different buildings on various days.
On each plot tab just select the values you want to plot. Select the building from the dropdown bar, the day of the week from the slider, and the color from the dropdown bar at the bottom. Then select which plots you want to include and hit the "Display Plots!" button.
The example below shows the hourly connections for Gables A, Gables B, and Gables C on Saturday. As you can see from this chart, the number of connections for Gables C is significantly lower than for Gables A and Gables B, especially at the peak party hours of 12:00 AM to 2:00 AM, which suggests that it might be more beneficial for RAs to focus on those two towers.
There are many applications to these charts because they essentially show student activity in any building.
def plot_connection_scatter(b):
    clear_output(wait=True)
    connection_scatter_plots, connection_scatter_labels = [], []
    if connection_scatter_plot1_check.value==True:
        connection_scatter_my_plot1 = plot_connections(connection_scatter_building_dropdown1.value,
                                                       connection_scatter_day_slider1.value, connection_scatter_color_dropdown1.value)
        connection_scatter_plots.append(connection_scatter_my_plot1)
        connection_scatter_label1 = str(connection_scatter_building_dropdown1.value)+ ' For '\
                                    + weekday_dict[connection_scatter_day_slider1.value]
        connection_scatter_labels.append(connection_scatter_label1)
    if connection_scatter_plot2_check.value==True:
        connection_scatter_my_plot2 = plot_connections(connection_scatter_building_dropdown2.value,
                                                       connection_scatter_day_slider2.value, connection_scatter_color_dropdown2.value)
        connection_scatter_plots.append(connection_scatter_my_plot2)
        connection_scatter_label2 = str(connection_scatter_building_dropdown2.value)+ ' For ' \
                                    + weekday_dict[connection_scatter_day_slider2.value]
        connection_scatter_labels.append(connection_scatter_label2)
    if connection_scatter_plot3_check.value==True:
        connection_scatter_my_plot3 = plot_connections(connection_scatter_building_dropdown3.value,
                                                       connection_scatter_day_slider3.value, connection_scatter_color_dropdown3.value)
        connection_scatter_plots.append(connection_scatter_my_plot3)
        connection_scatter_label3 = str(connection_scatter_building_dropdown3.value)+ ' For ' \
                                    + weekday_dict[connection_scatter_day_slider3.value]
        connection_scatter_labels.append(connection_scatter_label3)
    plt.legend(connection_scatter_plots, connection_scatter_labels)
connection_scatter_text1 = widgets.Latex(value='Select Building:', width='10%')
connection_scatter_building_dropdown1 = widgets.Dropdown(options = list_buildings, height='25px')
connection_scatter_hbox1 = widgets.HBox(children=[connection_scatter_text1, connection_scatter_building_dropdown1],
width='100%', height='50px')
connection_scatter_text2 = widgets.Latex(value='Select Building:', width='10%')
connection_scatter_building_dropdown2 = widgets.Dropdown(options = list_buildings, height='25px')
connection_scatter_hbox2 = widgets.HBox(children=[connection_scatter_text2, connection_scatter_building_dropdown2],
width='100%', height='50px')
connection_scatter_text3 = widgets.Latex(value='Select Building:', width='10%')
connection_scatter_building_dropdown3 = widgets.Dropdown(options = list_buildings, height='25px')
connection_scatter_hbox3 = widgets.HBox(children=[connection_scatter_text3, connection_scatter_building_dropdown3],
width='100%', height='50px')
#--------------------------------------------------------------------------------------
connection_scatter_text4 = widgets.Latex(value='Select Day:', width='10%')
connection_scatter_day_slider1 = widgets.IntSlider(min=0, max=6, step=1)
connection_scatter_hbox4 = widgets.HBox(children=[connection_scatter_text4, connection_scatter_day_slider1],
width='100%', height='50px')
connection_scatter_text5 = widgets.Latex(value='Select Day:', width='10%')
connection_scatter_day_slider2 = widgets.IntSlider(min=0, max=6, step=1)
connection_scatter_hbox5 = widgets.HBox(children=[connection_scatter_text5, connection_scatter_day_slider2],
width='100%', height='50px')
connection_scatter_text6 = widgets.Latex(value='Select Day:', width='10%')
connection_scatter_day_slider3 = widgets.IntSlider(min=0, max=6, step=1)
connection_scatter_hbox6 = widgets.HBox(children=[connection_scatter_text6, connection_scatter_day_slider3],
width='100%', height='50px')
#--------------------------------------------------------------------------------------
connection_scatter_text7 = widgets.Latex(value='Select Color:', width='10%')
connection_scatter_color_dropdown1 = widgets.Dropdown(options = my_colors, height='25px')
connection_scatter_hbox7 = widgets.HBox(children=[connection_scatter_text7, connection_scatter_color_dropdown1],
width='100%', height='50px')
connection_scatter_text8 = widgets.Latex(value='Select Color:', width='10%')
connection_scatter_color_dropdown2 = widgets.Dropdown(options = my_colors, height='25px')
connection_scatter_hbox8 = widgets.HBox(children=[connection_scatter_text8, connection_scatter_color_dropdown2],
width='100%', height='50px')
connection_scatter_text9 = widgets.Latex(value='Select Color:', width='10%')
connection_scatter_color_dropdown3 = widgets.Dropdown(options = my_colors, height='25px')
connection_scatter_hbox9 = widgets.HBox(children=[connection_scatter_text9, connection_scatter_color_dropdown3],
width='100%', height='50px')
connection_scatter_tab1 = widgets.VBox(children=[connection_scatter_hbox1, connection_scatter_hbox4, connection_scatter_hbox7])
connection_scatter_tab2 = widgets.VBox(children=[connection_scatter_hbox2, connection_scatter_hbox5, connection_scatter_hbox8])
connection_scatter_tab3 = widgets.VBox(children=[connection_scatter_hbox3, connection_scatter_hbox6, connection_scatter_hbox9])
connection_scatter_tab = widgets.Tab(children=[connection_scatter_tab1, connection_scatter_tab2, connection_scatter_tab3])
connection_scatter_tab.set_title(0, 'Plot1')
connection_scatter_tab.set_title(1, 'Plot2')
connection_scatter_tab.set_title(2, 'Plot3')
display(connection_scatter_tab)
connection_scatter_button = widgets.Button(description='Display Plots!', width='10%', padding=10)
connection_scatter_button.on_click(plot_connection_scatter)
connection_scatter_plot1_check = widgets.Checkbox(description = 'Include Plot1', value=False, width=40)
connection_scatter_plot2_check = widgets.Checkbox(description = 'Include Plot2', value=False, width=40)
connection_scatter_plot3_check = widgets.Checkbox(description = 'Include Plot3', value=False, width=40)
connection_scatter_hbox10 = widgets.HBox(children=[connection_scatter_button, connection_scatter_plot1_check,
connection_scatter_plot2_check, connection_scatter_plot3_check],
width='100%', height='50px')
display(connection_scatter_hbox10)
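The quantity plotted by this interface, connections per hour for one building on one day of the week, can be pictured with the following sketch. It is not the `plot_connections` function called above, which is defined elsewhere in the notebook. The helper name `hourly_connections`, the `(timestamp, building)` record layout, and the sample data are assumptions, and the weekday numbering follows Python's `datetime.weekday()` (0 = Monday), which may or may not match `weekday_dict`.

```python
from datetime import datetime

def hourly_connections(records, building, weekday):
    """Count connections to `building` for each hour of the given weekday.

    records: iterable of (timestamp, building) tuples (hypothetical layout).
    Returns a list of 24 counts, indexed by hour of day.
    """
    counts = [0] * 24
    for ts, name in records:
        if name == building and ts.weekday() == weekday:
            counts[ts.hour] += 1
    return counts

sat = datetime(2016, 4, 2)  # hypothetical date, for illustration only
records = [
    (sat.replace(hour=0, minute=5), 'Gables A'),
    (sat.replace(hour=0, minute=50), 'Gables A'),
    (sat.replace(hour=1, minute=10), 'Gables A'),
    (sat.replace(hour=0, minute=20), 'Gables C'),
    (datetime(2016, 4, 3, 0, 15), 'Gables A'),  # following day, excluded
]
counts_a = hourly_connections(records, 'Gables A', sat.weekday())
counts_c = hourly_connections(records, 'Gables C', sat.weekday())
print(counts_a[:3], counts_c[:3])
```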
<a id = '3.3'></a> From this interactive interface one is able to attain a great deal of information about the UNH campus through utilizing geographic plots of the campus.
This interface consists of three tabs that each provide a different function.
<a id = '3.3.1'></a> The first tab is the Path Plot tab. On this tab you type in the buildings you're interested in, separated by a comma and a space, tick the days of the week you're interested in, and select a time range with the slider. Then hit the "Plot Paths!" button to generate a plot showing where people are going from those buildings. From this plot one can visualize where people on campus are going on given days and at given times.
<a id = '3.3.2'></a> The second tab is the Heat Map tab. On this tab you just select the days of the week and a time range you're interested in, and then hit the "Plot Heat Map!" button. A heat map for the number of connections in every building is then generated. From this plot one could see what buildings have the most traffic at a certain time and date.
<a id = '3.3.3'></a> The last tab is the Iterative Path Plot tab, which has the most options. The first row contains three dropdown boxes: the building you're interested in, the number of iterations you want, and the number of starting paths to follow. The last of these keeps the number of people tracked in later iterations manageable; if you want every path tracked, choose the 'all' option at the bottom of the dropdown. The next two rows are for selecting the days of the week and the time range you're interested in. The last row contains a dropdown box for the maximum number of buildings you want to see, a text box where you enter how many paths you want to see after the first iteration, and a "Plot Iterative Paths!" button that generates the plots. This function takes a while to run because it displays a separate plot for each iteration. Note also that if 'all' is selected from the dropdown menu, the maximum number of buildings and the number of paths in the last row are ignored. From these plots you are able to track people's movements over multiple iterations.
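The idea of following the same group of people over several iterations can be sketched as below. This is a simplified illustration and not the notebook's `plot_iterative_path_lines` function, whose implementation is not shown in this section. The helper `iterative_paths`, the `device_visits` layout, and the device ids are hypothetical, and the sketch leaves out the limits on starting paths, buildings, and paths that the interface offers.

```python
from collections import Counter

def iterative_paths(device_visits, start_building, iterations):
    """Follow devices seen at `start_building` through successive moves.

    device_visits: dict mapping a device id to its buildings in time order
                   (hypothetical layout).
    Returns one Counter per iteration, counting (origin, destination) moves
    made by the devices that are still being tracked.
    """
    # Start tracking each device at its first visit to the start building
    positions = {
        device: visits.index(start_building)
        for device, visits in device_visits.items()
        if start_building in visits
    }
    results = []
    for _ in range(iterations):
        moves = Counter()
        next_positions = {}
        for device, idx in positions.items():
            visits = device_visits[device]
            if idx + 1 < len(visits):  # devices with no further visits drop out
                moves[(visits[idx], visits[idx + 1])] += 1
                next_positions[device] = idx + 1
        results.append(moves)
        positions = next_positions
    return results

# Made-up visit sequences, for illustration only
device_visits = {
    'a': ['Philbrook', 'Kingsbury', 'Philbrook', 'Christensen'],
    'b': ['Philbrook', 'Christensen', 'Philbrook'],
    'c': ['Kingsbury', 'Parsons'],  # never at Philbrook, so never tracked
}
results = iterative_paths(device_visits, 'Philbrook', 3)
for i, moves in enumerate(results, start=1):
    print(i, dict(moves))
```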
def plot_path_map(b):
    clear_output(wait=True)
    btext_list = (text.value).split(', ')
    weekday_list = [i for i in range(7) if hbox2.children[i+1].value==True]
    time_range = start_time_slider.value
    plot_path_lines(weekday_list, time_range,
                    btext_list)
def plot_heat_map(b):
    clear_output(wait=True)
    btext_list = (text.value).split(', ')
    weekday_list = [i for i in range(7) if hbox2.children[i+1].value==True]
    time_range = start_time_slider.value
    plot_path_lines(weekday_list, time_range,
                    btext_list, heat_map=True)
def plot_iter_paths(b):
    clear_output(wait=True)
    weekday_list = [i for i in range(7) if hbox2.children[i+1].value==True]
    time_range = start_time_slider.value
    plot_iterative_path_lines(weekday_list, time_range,
                              dropdown_iter_building.value, int(dropdown_iter_number.value), start_num=dropdown_iter_building_number.value,
                              max_buildings=int(dropdown_iter_max_buildings.value), path_num=int(iter_text.value))
text_label = widgets.Latex(value='Enter Buildings to plot:', width='20%')
text = widgets.Text(description='', width='70%')
hbox1 = widgets.HBox(height='40px',width='100%')
hbox1.children = [text_label, text]
text_label2 = widgets.Latex(value='Day of Week:', width='10%')
monday_checkbox = widgets.Checkbox(description = 'Monday: ', value=False, width='10%')
tuesday_checkbox = widgets.Checkbox(description = 'Tuesday: ', value=False, width='10%')
wednesday_checkbox = widgets.Checkbox(description = 'Wednesday:', value=False, width='10%')
thursday_checkbox = widgets.Checkbox(description = 'Thursday: ', value=False, width='10%')
friday_checkbox = widgets.Checkbox(description = 'Friday: ', value=False, width='10%')
saturday_checkbox = widgets.Checkbox(description = 'Saturday: ', value=False, width='10%')
sunday_checkbox = widgets.Checkbox(description = 'Sunday: ', value=False, width='10%')
hbox2 = widgets.HBox(height='40px',width='100%')
hbox2.children = [text_label2, monday_checkbox, tuesday_checkbox, wednesday_checkbox, thursday_checkbox,
friday_checkbox, saturday_checkbox, sunday_checkbox]
text_label3 = widgets.Latex(value='Enter Time Range:', width='10%')
start_time_slider = widgets.IntRangeSlider(min=0,max=23,step=1,value=(0,23), width='80%')
hbox3 = widgets.HBox(height='40px',width='100%')
hbox3.children = [text_label3, start_time_slider]
path_plot_button = widgets.Button(description='Plot Paths!', width='10%')
submit_path_hbox = widgets.HBox(height='40px',width='100%')
submit_path_hbox.children = [path_plot_button]
path_plot_button.on_click(plot_path_map)
heat_button = widgets.Button(description='Plot Heat Map!')
submit_heat_hbox = widgets.HBox(height='40px',width='100%')
submit_heat_hbox.children = [heat_button]
heat_button.on_click(plot_heat_map)
building_range = '5 6 7 8 9 10 11 12 13 14 15 all'.split()
iter_building_latex = widgets.Latex(value='Select Building:', width='10%')
iter_number_latex = widgets.Latex(value='Select # Iterations:', width='15%')
iter_building_number = widgets.Latex(value='Select # Start Paths:', width='10%')
dropdown_iter_building_number = widgets.Dropdown(options=building_range, width='15%')
dropdown_iter_building = widgets.Dropdown(options=list_buildings, width='15%')
dropdown_iter_number = widgets.Dropdown(options=['2', '3', '4'])
iter_hbox = widgets.HBox(height='40px',width='100%')
iter_hbox.children = [iter_building_latex, dropdown_iter_building, iter_number_latex, dropdown_iter_number,
iter_building_number, dropdown_iter_building_number]
iter_build_num_latex = widgets.Latex(value='Select Max # Buildings:', width='15%')
iter_path_num = widgets.Latex(value='Enter # Paths:', width='10%')
dropdown_iter_max_buildings = widgets.Dropdown(options=building_range)
iter_text = widgets.Text(description='', width='20%')
iter_button = widgets.Button(description='Plot Iterative Paths!')
submit_iter_hbox = widgets.HBox(height='40px',width='100%')
submit_iter_hbox.children = [iter_build_num_latex, dropdown_iter_max_buildings, iter_path_num, iter_text, iter_button]
iter_button.on_click(plot_iter_paths)
path_tab = widgets.VBox(children=[hbox1, hbox2, hbox3, submit_path_hbox])
heat_tab = widgets.VBox(children=[hbox2, hbox3, submit_heat_hbox])
iterative_tab = widgets.VBox(children=[iter_hbox, hbox2, hbox3, submit_iter_hbox])
tab = widgets.Tab(children=[path_tab, heat_tab, iterative_tab])
tab.set_title(0, 'Path Plot')
tab.set_title(1, 'Heat Map')
tab.set_title(2, 'Iterative Path Plot')
display(tab)
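For the Heat Map tab, the value behind each building's color is essentially a connection count for the selected days and hours. The sketch below shows one way such counts could be turned into relative intensities. It is not the `plot_path_lines(..., heat_map=True)` code path used above; the helper `heat_intensities`, the `(timestamp, building)` layout, the inclusive hour range, and the scaling to the busiest building are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime

def heat_intensities(records, weekdays, hour_range):
    """Scale per-building connection counts to the range 0..1.

    records: iterable of (timestamp, building) tuples (hypothetical layout).
    weekdays: collection of weekday numbers, 0 = Monday as in datetime.weekday().
    hour_range: (first_hour, last_hour), treated as inclusive here.
    """
    first_hour, last_hour = hour_range
    counts = Counter(
        building
        for ts, building in records
        if ts.weekday() in weekdays and first_hour <= ts.hour <= last_hour
    )
    peak = max(counts.values(), default=0)
    # When nothing matches, counts is empty and the result is an empty dict
    return {building: n / peak for building, n in counts.items()}

base = datetime(2016, 4, 4)  # hypothetical date, for illustration only
records = [
    (base.replace(hour=8), 'Philbrook'),
    (base.replace(hour=9), 'Philbrook'),
    (base.replace(hour=9), 'Kingsbury'),
    (base.replace(hour=20), 'Kingsbury'),  # outside the selected hours
]
intensities = heat_intensities(records, {base.weekday()}, (7, 10))
print(intensities)
```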
<a id = '4'></a> Our results clearly showed that tracking WiFi connections on a university campus provides enough information to get a good general sense of how students are traveling around campus and where they are at given points throughout the day. We believe this data can be useful for university administrators, the public safety office, and the university information technology department. This research and initial look into student movement patterns will create a platform from which these stakeholders can find answers to questions about the student body, or find new questions to ask that they wouldn't have been able to answer before having this resource.
<a id = '5'></a>
Through the process of performing this research, we were successful in showing that the data the university is already collecting on its students is incredibly valuable for a smart campus type of application. We set out to show that this data could be used in place of investing in extremely expensive technologies to perform the same task.
However, while we were successful in demonstrating the usefulness of this data, our findings do have some limitations. For starters, it would be best to have access to a live stream of data. We are still waiting to hear back from the IT department about whether this is possible, or whether the access points only transmit connection data at a set interval. Having access to a live stream would be very valuable for applications such as responding to public safety concerns, including disasters or events such as school shootings. Being able to produce a live heat map of the campus may give first responders or public safety officers a better idea of what the situation is.
Additionally, working with devices other than strictly Apple devices is a clear next step. Doing so would require filtering out stationary devices while still capturing the devices that travel with students. For an initial proof of concept, working strictly with Apple devices is sufficient, but gaining the extra insight provided by as much of the student body as possible would be helpful.
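One simple heuristic for separating stationary devices from devices that travel with students would be to keep only devices seen in more than one building. The sketch below illustrates that heuristic only; it was not part of this project, and the helper name `mobile_devices`, the `(device_id, building)` layout, and the threshold of two buildings are assumptions.

```python
from collections import defaultdict

def mobile_devices(records, min_buildings=2):
    """Return the ids of devices seen in at least `min_buildings` buildings.

    records: iterable of (device_id, building) tuples (hypothetical layout).
    """
    seen = defaultdict(set)
    for device, building in records:
        seen[device].add(building)
    return {device for device, buildings in seen.items() if len(buildings) >= min_buildings}

# Made-up sample records, for illustration only
records = [
    ('phone-1', 'Christensen'),
    ('phone-1', 'Philbrook'),
    ('printer-7', 'Kingsbury'),
    ('printer-7', 'Kingsbury'),
]
print(mobile_devices(records))
```

A device that belongs to a student but never leaves a dorm room, such as a desktop computer, would also be filtered out by this rule, which is the intended effect for movement analysis.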
Lastly, a major limitation we had was that we did not have disconnect times in the data. This makes it impossible to know for sure if a student has left a building. This is an unfortunate limitation, however, through filtering data it is fairly easy to presume which students have left campus, assuming students aren't spending more than 9 hours in a classroom.
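The filtering idea described above can be sketched as follows: a stay is assumed to end at the device's next connection, unless that connection is more than nine hours later, in which case the departure time is treated as unknown. This is a minimal illustration under those assumptions, not the filtering code used in this project; the helper `estimate_stays`, the `(timestamp, building)` layout, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

def estimate_stays(events, max_stay=timedelta(hours=9)):
    """Estimate (building, arrival, departure) stays for ONE device.

    events: list of (timestamp, building) tuples, in any order (hypothetical layout).
    departure is None when the next connection is more than max_stay away
    (or there is no next connection), i.e. the device is presumed to have left.
    """
    ordered = sorted(events)
    stays = []
    for i, (ts, building) in enumerate(ordered):
        next_ts = ordered[i + 1][0] if i + 1 < len(ordered) else None
        if next_ts is not None and next_ts - ts <= max_stay:
            stays.append((building, ts, next_ts))
        else:
            stays.append((building, ts, None))
    return stays

# Made-up sample events, for illustration only
events = [
    (datetime(2016, 4, 4, 9, 0), 'Kingsbury'),
    (datetime(2016, 4, 4, 11, 0), 'Philbrook'),
    (datetime(2016, 4, 5, 9, 0), 'Kingsbury'),  # 22 hours after the previous event
]
stays = estimate_stays(events)
for stay in stays:
    print(stay)
```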
The next steps for this project are to create some very succinct dashboards, each specific to a desired use case. We plan to work with departments at the university to create dashboards that are helpful to their needs and that help usher the University of New Hampshire closer to being a smart campus.